A Goal-Directed Intermediate Level Executive for Image Interpretation
Authors: Kohl, Hanson, and Riseman
Abstract
GOLDIE is a system developed to provide top-down control of the low- and intermediate-level processes that create and modify the intermediate-level descriptions of image data used in image interpretation. The basic control structure of GOLDIE is the schema, a declarative specification of control strategies invoked in response to a request (goal) for a particular form of intermediate-level data. Using this control paradigm, high-level interpretation processes gain the ability to create or refine the intermediate-level data according to predefined goals and/or current hypotheses during interpretation. A variety of schemas for tasks such as region segmentation, line extraction, collinear line grouping, and line-based segmentation are currently implemented within the GOLDIE system.

1. THE INTERMEDIATE LEVEL OF INTERPRETATION

Due to the complexity of the visual interpretation task and the inherent unreliability and ambiguity of image data, completely bottom-up (i.e. data-directed) approaches to interpretation cannot be expected to be generally effective. The local nature of the processes that extract descriptions of image events typically leads to intermediate-level descriptions of the image data in which individual semantic objects from the scene are either broken into a number of pieces (fragmentation) or merged with other objects in the scene (overmerging) [4,11]. Expectations, or context, provided by the system's knowledge base and/or a partial scene interpretation can be used to compensate for these factors and to produce data descriptions that more closely match the semantic content of the scene. Such a methodology requires that control of processing be bidirectional, shifting between data-directed and goal-directed methods in an opportunistic manner [1,5,9,11].
This implies that most of the processes utilized by a vision system during the construction of an interpretation must be sensitive to high-level goals and constraints.

This concept of bottom-up and top-down control may be viewed in terms of the levels of abstraction used in the image interpretation process. Within the VISIONS Image Understanding System [8,11], three such levels may be identified. Low-level processes are those which operate on pixel data to produce a set of intermediate-level symbolic tokens representing significant events contained in the raw image data (e.g. regions, lines, or surfaces). Intermediate-level grouping processes manipulate this initial set of tokens to produce new tokens, and high-level processes provide the semantic interpretation of the image based on the full set of image tokens.

*This work was supported in part by the Air Force Office of Scientific Research Grant AFOSR-86-0021 and the National Science Foundation Grant DCR-8318776.

GOLDIE (Goal Directed Intermediate-Level Executive) [6,7] is a system within the VISIONS environment that provides the mechanisms for the top-down control of the low- and intermediate-level processes that create or modify tokens. A request for a particular type of image token is expressed as a goal, a data structure that defines the generic class of processing required (e.g. region segmentation, line extraction, grouping, etc.). Constraints, stored as attributes within the goal data structure, express the desired characteristics of the tokens to be produced, and may be represented through either semantic or image-based criteria. Semantic constraints are defined in terms of semantic labels (e.g. "segment a specific portion of the image to separate Tree and Sky"), while image-based constraints are expressed in terms of measurable image features (e.g. "produce regions that exhibit homogeneous texture measures").
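As a rough illustration of the goal data structure just described, the following Python sketch stores the generic class of processing together with semantic and image-based constraints as attributes. The class name `Goal`, its field names, the `area` window, and the dictionary encodings of the constraints are assumptions made for this example; the source does not give GOLDIE's actual representation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Goal:
    """A request for a particular form of intermediate-level data (hypothetical layout)."""
    # Generic class of processing required, e.g. "region-segmentation".
    process_class: str
    # Assumed image window (row0, col0, row1, col1); None means the whole image.
    area: Optional[Tuple[int, int, int, int]] = None
    # Constraints phrased in terms of semantic labels.
    semantic_constraints: Dict[str, Tuple[str, ...]] = field(default_factory=dict)
    # Constraints phrased in terms of measurable image features.
    image_constraints: Dict[str, Tuple[str, ...]] = field(default_factory=dict)

    def is_semantic(self) -> bool:
        """True when the goal carries at least one semantic constraint."""
        return bool(self.semantic_constraints)


# "segment a specific portion of the image to separate Tree and Sky"
tree_sky = Goal(
    process_class="region-segmentation",
    area=(0, 0, 120, 256),
    semantic_constraints={"separate": ("Tree", "Sky")},
)

# "produce regions that exhibit homogeneous texture measures"
uniform_texture = Goal(
    process_class="region-segmentation",
    image_constraints={"homogeneous": ("texture",)},
)
```

Keeping both kinds of constraint on one structure lets a single request mix semantic labels with image features, which matches the description of constraints as attributes of the goal.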
The basic control structure of GOLDIE is the schema, a declarative specification of control strategies that may be used by the system to satisfy a specific goal. Schemas provide a flexible and extensible control structure that is currently used within VISIONS to direct the high-level semantic interpretation processes [4,10,12]. GOLDIE extends this concept of schema-directed control to the intermediate and low levels of processing. The power of this system lies not in the particular set of low- and intermediate-level processes that have been integrated into the current system, but rather in the various representations for knowledge, data, and control that implement goal-directed processing.

2. GOAL DIRECTED CONTROL

GOLDIE supports both goal-directed and data-directed control of processing. Data-directed processing is controlled by sets of evaluation rules which associate hypotheses (e.g. "candidate for resegmentation", "candidate for merge", etc.) with tokens; the hypotheses are used to direct the subsequent activities of the system. Current goal constraints determine the specific set of rules used to compute the value of a hypothesis. For example, if a particular set of goal constraints indicated that uniformly textured regions were of primary importance, the rules used to establish the acceptability of region tokens would include measures of the homogeneity of both hue and short line density. Tokens representing large regions with high variance in either of these features would probably be considered unacceptable.

Various types of intermediate-level hypotheses may be established in this manner. For example, the initialization schema, which provides a data-directed region segmentation for either the entire image (at system startup) or for an area defined by a (set of) region token(s), uses the rule mechanism to hypothesize whether individual region tokens should be merged or resegmented.
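The rule mechanism above can be sketched in Python as follows. This is a minimal illustration, not GOLDIE's rule language: the token fields, the numeric thresholds, the `RULE_SETS` table, and the function names are all hypothetical. It shows goal constraints selecting a rule set, and the selected rules attaching a "candidate for resegmentation" hypothesis to large regions with high variance in hue or short line density.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RegionToken:
    name: str
    area: int                      # region size in pixels
    hue_variance: float            # assumed to be normalised to [0, 1]
    line_density_variance: float   # variance of short line density, same scale


Rule = Callable[[RegionToken], bool]


def high_hue_variance(token: RegionToken) -> bool:
    return token.hue_variance > 0.2            # threshold is an assumption


def high_line_density_variance(token: RegionToken) -> bool:
    return token.line_density_variance > 0.2   # threshold is an assumption


# Hypothetical table mapping a goal constraint to the rules it activates.
RULE_SETS: Dict[str, List[Rule]] = {
    "uniform-texture": [high_hue_variance, high_line_density_variance],
}


def select_rules(goal_constraints: List[str]) -> List[Rule]:
    """Collect the rules activated by the current goal constraints."""
    rules: List[Rule] = []
    for constraint in goal_constraints:
        rules.extend(RULE_SETS.get(constraint, []))
    return rules


def candidate_for_resegmentation(token: RegionToken, rules: List[Rule],
                                 min_area: int = 500) -> bool:
    """Large regions with high variance in any selected feature are flagged."""
    if token.area < min_area:
        return False
    return any(rule(token) for rule in rules)


tokens = [
    RegionToken("sky", 4000, 0.02, 0.01),
    RegionToken("foliage+roof", 3500, 0.35, 0.05),
    RegionToken("speck", 40, 0.90, 0.90),
]
rules = select_rules(["uniform-texture"])
hypotheses = {t.name: candidate_for_resegmentation(t, rules) for t in tokens}
```

With these made-up values only the large, high-variance region is flagged; the small region is skipped regardless of its variance because the text singles out large regions.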
An instance of the initialization schema first uses the region segmentation schema to select the image feature(s), segmentation algorithm, and parameters which will be used to produce an initial "best guess" for a set of region tokens. Using the set of rules selected according to the goal constraints, the initialization schema instance then evaluates this set of tokens and continues to process those which have unacceptable values for region-merging-hypothesis or region-resegmentation-hypothesis. As each of these tokens is processed, the schema is able to perform resegmentation or merging using algorithms and criteria appropriate to the characteristics of the particular region. This form of data-directed processing demonstrates intermediate-level control that uses knowledge of region characteristics, as well as knowledge about the performance of the segmentation algorithms under varying conditions of the data; we call this non-semantic (or image-domain) knowledge. Figure 2 shows the segmentation produced by this schema for the image from Figure 1.

Kohl, Hanson, and Riseman 811

Figure 1: Outdoor Scene
Figure 2: Initialization Schema Segmentation

Although this form of intermediate-level control provides image tokens that appear to correlate reasonably well with image content, this fact alone does not guarantee the suitability of the tokens for interpretation. The ultimate significance or quality of a particular token is dependent only upon its utility with respect to a set of interpretation goals and cannot be measured independent of the interpretation process. Therefore GOLDIE also supports goal-directed control in which a goal contains an explicit specification of the token characteristics which must be met by any potential set of results.
This evaluation-constraint is expressed as a function-value pair; the function is applied to each set of tokens produced by the contracting schema instance, and only if the return value of the function exceeds the value specified is the set of tokens judged to be potentially acceptable.

For example, if there has been an inference of the presence of a pair of shutters on the side of a house, an interpretation process might request a resegmentation of the area defined by that pair in an attempt to define tokens that could be unambiguously labeled as either window or shutter. The goal specification for this request includes an evaluation-constraint that indicates a preference for region tokens that are rectangular. Figure 3 shows the result of this goal specification over the three areas bounded by pairs of initially hypothesized shutters in the original segmentation (Figure 2). The dark lines in this figure represent region boundaries which were constructed by GOLDIE in response to the definition of an "area of interest" by the interpretation process. Each of these newly defined regions was then resegmented according to the specified goal. Note that although image noise, aliasing, and window reflections prevent a "perfect" segmentation of these areas, the resegmentation process has provided a set of tokens which are appropriate for interpretation of windows and shutters.

Figure 3: Rectangular Shutter Regions

The tokens produced through the satisfaction of this goal can be labeled and precisely located, thereby making the information generated by this particular interpretation process available to any other high-level processes that may be concerned with this data. Conversely, if the intent of the interpretation process is to identify trees in the image, the interpretation system could specify a merge goal having an evaluation-constraint that indicates a preference for large green textured areas (thereby producing the region in Figure 4).

3. ARCHITECTURE OF GOLDIE

The GOLDIE system, represented by the modules within the large dashed rectangle of Figure 5, may be described in terms of four major functional components: the process controller, the data structures for intermediate-level tokens and hypotheses (ISTM), the representation of explicit intermediate-level knowledge (ILTM), and the representation of the control state of the system (Goal Blackboard and Schema Instantiations).
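To make the relationship between these components concrete, here is a small Python sketch of one way they might interact. The source names the components but not how they cooperate, so everything about the control flow is an assumption: goals are taken from a blackboard in posting order, a schema is looked up by processing class in a dictionary standing in for the ILTM, and the first candidate token set whose evaluation function returns more than the specified value is stored in a list standing in for the ISTM. All class, field, and function names, and the rectangularity measure, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple

Token = Dict[str, float]       # a token reduced to a feature dictionary
TokenSet = List[Token]


@dataclass
class Goal:
    process_class: str
    # Evaluation-constraint as a (function, value) pair, as in Section 2.
    evaluation_constraint: Tuple[Callable[[TokenSet], float], float]


# A schema is modelled here as a generator of candidate token sets.
Schema = Callable[[Goal], Iterable[TokenSet]]


@dataclass
class Executive:
    iltm: Dict[str, Schema]                               # schemas by processing class
    istm: List[Token] = field(default_factory=list)       # accepted tokens
    blackboard: List[Goal] = field(default_factory=list)  # pending goals

    def post(self, goal: Goal) -> None:
        self.blackboard.append(goal)

    def run(self) -> None:
        """Process controller: satisfy each posted goal in order (assumed policy)."""
        while self.blackboard:
            goal = self.blackboard.pop(0)
            function, value = goal.evaluation_constraint
            for candidate in self.iltm[goal.process_class](goal):
                # Acceptable only if the function's result exceeds the value.
                if function(candidate) > value:
                    self.istm.extend(candidate)
                    break


def mean_rectangularity(tokens: TokenSet) -> float:
    """Hypothetical measure: average of a per-token rectangularity score."""
    return sum(t["rectangularity"] for t in tokens) / len(tokens)


def toy_resegmentation(goal: Goal) -> Iterable[TokenSet]:
    """Stand-in schema yielding two made-up candidate segmentations."""
    yield [{"rectangularity": 0.55}]
    yield [{"rectangularity": 0.90}, {"rectangularity": 0.85}]


executive = Executive(iltm={"region-resegmentation": toy_resegmentation})
executive.post(Goal("region-resegmentation", (mean_rectangularity, 0.8)))
executive.run()
```

In this toy run the first candidate is rejected and the second is accepted, so two tokens end up in the token store and the blackboard is left empty.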